ABSTRACT



A method of processing an image including the steps of: 
locating within the image the position of at least one 
predetermined feature; extracting from the image data rep- 
resenting each feature; and calculating for each feature a 
feature vector representing the position of the image data of 
the feature in an N-dimensional space, such space being 
defined by a plurality of reference vectors each of which is 
an eigenvector of a training set of like features in which the 
image data of each feature is modified to normalize the 
shape of each feature thereby to reduce its deviation from a 
predetermined standard shape of the feature, which step is 
carried out before calculating the corresponding feature 
vector. 

11 Claims, 3 Drawing Sheets 
NORMALIZED IMAGE FEATURE PROCESSING

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to a method of processing an image and
particularly, but not exclusively, to the use of such a method in the
recognition and encoding of images of objects such as faces.

2. Related Art

[...] is in automatic identity verification techniques for restricted
access buildings or fund [...] manner discussed in our UK application
GB9005190.5. In many such fund transfer transactions a user carries a
card which includes machine-readable data stored magnetically,
electrically or optically. One particular application of face
recognition is to prevent the use of such cards by unauthorised
personnel by storing face identifying data of the correct user on the
card, reading the data out, obtaining a facial image of the person
seeking to use the card by means of a camera, analyzing the image, and
comparing the results of the analysis with the data stored on the card
for the correct user.

The storage capacity of such cards is typically only a few hundred
bytes, which is very much smaller than the memory space needed to store
a recognisable image as a frame of pixels. It is therefore necessary to
use an image processing technique which allows the image to be
characterised using the smaller number of memory bytes.

Another application of image processing which reduces the number of
bytes needed to characterise an image is in hybrid video coding
techniques for video telephones as disclosed in our earlier filed
application published as U.S. Pat. No. 4,841,575. In this, and similar,
applications the perceptually important parts of the image are located
and the available coding data is preferentially allocated to those
parts.

A known method of such processing of an image comprises the steps of:
locating within the image the position of at least one predetermined
feature; extracting image data from said image representing each said
feature; and calculating for each feature a feature vector representing
the position of the image data of the feature in an N-dimensional
space, said space being defined by a plurality of reference vectors
each of which is an eigenvector of a training set of images of like
features.

The Karhunen-Loeve transform (KLT) is well known in the signal
processing art for various applications. It has been proposed to apply
this transform to identification of human faces (Sirovich and Kirby,
J. Opt. Soc. Am. A, vol 4, no 3, pp 519-524, "Low Dimensional Procedure
for the Characterisation of Human Faces", and IEEE Trans on Pattern
Analysis and Machine Intelligence, Vol 12, no 1, pp 103-108,
"Application of the Karhunen-Loeve Procedure for the Characterisation
of Human Faces"). In these techniques, images of substantially the
whole face of members of a reference population were processed to
derive a set of N eigenvectors each having a picture-like appearance
(eigen-pictures, or caricatures). These were stored. In a subsequent
recognition phase, a given image of a test face (which need not belong
to the reference population) was characterised by its image vector in
the N-dimensional space defined by the eigenvectors. By comparing the
image vector, or data derived therefrom, with the identification data
one can generate a verification signal in dependence upon the result of
the comparison.

However, despite the ease with which humans "never forget a face", the
task for a machine is a formidable one because a person's facial
appearance can vary widely in many respects over time because the eyes
and mouth, in the case of facial processing, for example, are mobile.

SUMMARY OF THE INVENTION

According to a first aspect of the present invention a [...] reduce its
deviation from a predetermined standard shape of said feature, which
step is carried out before calculating the corresponding feature
vector.

We have found that recognition accuracy of images of faces, for
example, can be improved greatly by such a modifying step which reduces
the effects of a person's changing facial expression.

In the case of an image of a face, for example, the predetermined
feature could be the entire face or a part of it such as the nose or
mouth. Several predetermined features may be located and characterised
as vectors in the corresponding space of eigenvectors if desired.

It will be clear to those skilled in this field that the present
invention is applicable to processing images of objects other than the
faces of humans notwithstanding that the primary application envisaged
by the applicant is in the field of human face images and that the
discussion and specific examples of embodiments of the invention are
directed to such images.

The invention also enables the use of fewer eigen-pictures, and hence
results in a saving of storage or of transmission capacity.

Further, by modifying the shape of the feature towards a standard
(topologically equivalent) feature shape, the accuracy with which the
feature can be located is improved.

Preferably the training set of images of like features are modified to
normalise the shape of each of the training set of images, thereby to
reduce its deviation from a predetermined standard shape of said
feature, which step is carried out before calculating the eigenvectors
of the training set of images.

The method is useful not only for object recognition but also as a
hybrid coding technique in which feature position [...] feature
representative data (the N-dimensional [...]) are transmitted to a
receiver where an image is [...] by combining the eigen-pictures
corresponding to the image vector.

Eigen-pictures provide a means by which the variation in a set of
related images can be extracted and used to represent those images and
others like them. For instance, an eye image could be economically
represented in terms of a best coordinate system, 'eigen-eyes'.

The eigen-pictures themselves are determined from a training set of
representative images, and are formed such that the first eigen-picture
embodies the maximum variance between the images, and successive
eigen-pictures have monotonically decreasing variance. An image in the
set can then be expressed as a series, with the eigen-pictures
effectively forming basis functions:

$$I = M + w_1 P_1 + w_2 P_2 + \ldots + w_m P_m$$
where

M = mean over entire training set of images
$w_i$ = component of the i'th eigen-picture
$P_i$ = i'th eigen-picture, of m
I = original image

If we truncate the above series we still have the best
representation we could for the given number of eigen-pictures,
in a mean-square-error sense.

The basis of eigen-pictures is chosen such that they point
in the directions of maximum variance, subject to being
orthogonal. In other words, each training image is considered
as a point in n-dimensional space, where n is the size
of the training images in pels; eigen-picture vectors are then
chosen to lie on lines of maximum variance through the
cluster(s) produced.

Given training images $I_1, \ldots, I_m$, we first form the mean
image M, and then the difference images (a.k.a.
'caricatures') $D_i = I_i - M$.

The choosing of the basis of eigen-pictures to point in the
directions of maximum variance is equivalent to choosing
our eigen-picture vectors $P_k$ such that

$$\sum_{i=1}^{m} \left( P_k^T D_i \right)^2$$

is maximised, with $P_i^T P_k = 0$ for $i < k$.

The eigen-pictures $P_k$ above are in fact the eigenvectors of
a very large covariance matrix, the solution of which would
be intractable. However, the problem can be reduced to more
manageable proportions by forming the matrix L where

$$L_{ij} = D_i^T D_j$$

and solving for the eigenvectors $v_k$ of L.
The eigen-pictures can then be found by

$$P_k = \sum_{j=1}^{m} v_{kj} D_j$$
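
The reduction above can be illustrated with a minimal sketch in plain Python. It is not taken from the specification: the function names `dot` and `eigen_pictures`, the use of power iteration with deflation to find the eigenvectors of the small m x m matrix L, and the normalisation of each eigen-picture to unit length are all assumptions made for illustration. The requested number of eigen-pictures must not exceed the rank of the caricature set.

```python
import math

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def eigen_pictures(images, count, iters=500):
    """Return (mean, pictures) for a list of flattened training images.

    Hypothetical helper: works on the small m x m matrix L = D D^T
    instead of the n x n pixel covariance matrix.
    """
    m, n = len(images), len(images[0])
    mean = [sum(img[p] for img in images) / m for p in range(n)]
    # Caricatures D_i = I_i - M
    D = [[img[p] - mean[p] for p in range(n)] for img in images]
    # L_ij = D_i . D_j
    L = [[dot(D[i], D[j]) for j in range(m)] for i in range(m)]
    pictures = []
    for _ in range(count):
        v = [1.0 + 0.1 * i for i in range(m)]      # arbitrary start vector
        for _ in range(iters):                     # power iteration on L
            w = [dot(row, v) for row in L]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                raise ValueError("count exceeds the rank of the training set")
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in L])   # eigenvalue estimate
        # P_k = sum_j v_kj D_j, then scaled to unit length (an assumption)
        P = [sum(v[j] * D[j][p] for j in range(m)) for p in range(n)]
        length = math.sqrt(dot(P, P))
        pictures.append([x / length for x in P])
        # Deflate L so the next pass finds the next eigenvector
        L = [[L[i][j] - lam * v[i] * v[j] for j in range(m)] for i in range(m)]
    return mean, pictures

# Three tiny 4-pel "images"; their caricatures span two dimensions.
training = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0], [0.0, 0.0, 3.0, 0.0]]
mean, pictures = eigen_pictures(training, 2)
```

For realistic image sizes a library eigen-solver would normally replace the power iteration; the point of the sketch is only that the eigen-problem solved has the size of the training set, not the size of the image.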



The term 'representation vector' has been used to refer to the
vector whose components ($w_i$) are the factors applied to each
eigen-picture ($P_i$) in the series. That is, the representation
vector is $(w_1, w_2, \ldots, w_m)^T$.

The representation vector equivalent to an image I is
formed by taking the inner product of I's caricature with
each eigen-picture:

$$w_i = (I - M)^T P_i \quad \text{for } 1 \le i \le m$$

Note that a certain assumption is made when it comes to
representing an image taken from outside the training set
used to create eigen-pictures; the image is assumed to be
sufficiently 'similar' to those in the training set to enable it
to be well represented by the same eigen-pictures.

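The representations of two images can be compared by
calculating the Euclidean distance between them:

$$d_{ij} = \lVert w^{(i)} - w^{(j)} \rVert$$

where $w^{(i)}$ and $w^{(j)}$ are the representation vectors of the
two images. Thus, recognition can be achieved via a simple threshold,
where $d_{ij} < T$ means recognised, or the value of $d_{ij}$ can be
used as a sliding confidence scale.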
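
The projection and threshold test described above can be sketched as follows. This is an illustrative sketch only: the function names, the toy eigen-pictures and the threshold value are assumptions, and the eigen-pictures are taken to be orthonormal.

```python
import math

def representation_vector(image, mean, pictures):
    """w_i = (I - M) . P_i for each eigen-picture P_i."""
    caricature = [p - q for p, q in zip(image, mean)]
    return [sum(c * e for c, e in zip(caricature, pic)) for pic in pictures]

def distance(w_a, w_b):
    """Euclidean distance between two representation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w_a, w_b)))

def recognised(w_test, w_stored, threshold):
    """Simple threshold decision: d < T means recognised."""
    return distance(w_test, w_stored) < threshold

# Toy data (hypothetical): a 4-pel image space with two eigen-pictures.
mean = [1.0, 1.0, 1.0, 1.0]
pictures = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
stored = representation_vector([3.0, 2.0, 1.0, 1.0], mean, pictures)
probe = representation_vector([3.0, 2.5, 1.0, 1.0], mean, pictures)
match = recognised(probe, stored, threshold=1.0)
```

Only the m components of the representation vector need to be stored for the comparison, which is what makes the approach attractive when storage is limited to a few hundred bytes.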

Deformable templates consist of parametrically defined 
geometric templates which interact with images to provide 
a best fit of the template to a corresponding image feature. 
For example, a template for an eye might consist of a circle 
for the iris and two parabolas for the eye/eyelid boundaries,
where size, shape and location parameters are variable. 

An energy function is formed by integrating certain image
attributes over template boundaries, and parameters are
iteratively updated in an attempt to minimise this function.
This has the effect of moving the template towards the best
available fit in the given image.

The location (within the image) of the position of at least 
one predetermined feature may be found using a first tech- 
nique to provide a coarse estimation of position and a 
second, different, technique to improve upon the coarse 
estimation. The second technique preferably involves the 
use of such a deformable template technique. 

The deformable template technique requires certain filtered
images in addition to the raw image itself, notably
peak, valley and edge images. Suitable processed images
can be obtained using morphological filters, and it is this
stage which is detailed below.

Morphological filters are able to provide a wide range of
filtering functions including nonlinear image filtering, noise
suppression, edge detection, skeletonization, shape recognition
etc. All of these functions are provided via simple
combinations of two basic operations termed erosion and
dilation. In our case we are only interested in valley, peak
and edge detection.

Erosion of grayscale images effectively causes bright
areas to shrink in upon themselves, whereas dilation causes
bright areas to expand. An erosion followed by a dilation
causes bright peaks to be lost (operator called 'open').
Conversely, a dilation followed by an erosion causes dark
valleys to be filled (operator called 'close'). For specific
details see Maragos P, (1987), "Tutorial on Advances in
Morphological Image Processing and Analysis", Optical
Engineering, Vol 26, No 7.
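
The 'open' and 'close' operators described above can be sketched in plain Python. The flat 3x3 structuring element, the function names, and the way peak and valley images are formed (subtracting the opened image from the original, and the original from the closed image) are assumptions for illustration; the text does not specify them.

```python
def _neighbourhood_filter(img, pick):
    """Apply pick (min or max) over a flat 3x3 window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [img[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            row.append(pick(window))
        out.append(row)
    return out

def erode(img):
    return _neighbourhood_filter(img, min)    # bright areas shrink

def dilate(img):
    return _neighbourhood_filter(img, max)    # bright areas expand

def open_image(img):
    return dilate(erode(img))                 # bright peaks are lost

def close_image(img):
    return erode(dilate(img))                 # dark valleys are filled

def peak_image(img):
    """What 'open' removed: original minus opened (assumed definition)."""
    opened = open_image(img)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(img, opened)]

def valley_image(img):
    """What 'close' filled in: closed minus original (assumed definition)."""
    closed = close_image(img)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(closed, img)]

# A flat 5x5 image with one bright pel in the centre.
image = [[10] * 5 for _ in range(5)]
image[2][2] = 50
peaks = peak_image(image)       # non-zero only at the bright pel
valleys = valley_image(image)   # all zero: there is no dark valley
```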

In image processing systems of the kind to which the
present invention relates, it is often necessary to locate the
object, e.g. head or face, within the image prior to processing.

Usually this is achieved by edge detection, but traditional
edge detection techniques are purely local (an edge is
indicated whenever a gradient of image intensity occurs)
and hence will not in general form an edge that is completely
closed (i.e. forms a loop around the head) but will instead
create a number of edge segments which together outline or
partly outline the head. Post-processing of some kind is thus
usually necessary.

We have found that the adaptive contour model, or
"snake", technique is particularly effective for this purpose.
Preferably, the predetermined feature of the image is located
by determining parameters of a closed curve arranged to lie
adjacent a plurality of edge features of the image, said curve
being constrained to exceed a minimum curvature and to
have a minimum length compatible therewith. The boundary
of the curve may be initially calculated proximate the edges
of the image, and subsequently interactively reduced.

Prior to a detailed description of the physical embodiment
of the invention, the 'snake' signal processing techniques
mentioned above will now be described in greater detail.

Introduced by Kass et al [Kass M, Witkin A, Terzopoulos D,
"Snakes: Active Contour Models", International Journal
of Computer Vision, 321-331, 1988], snakes are a
method of attempting to provide some of the post-processing
that our own visual system performs. A snake has built into
it various properties that are associated with both edges and
the human visual system (e.g. continuity, smoothness and to
some extent the capability to fill in sections of an edge that
have been occluded).

A snake is a continuous curve (possibly closed) that
attempts to dynamically position itself from a given starting
position in such a way that it 'clings' to edges in the image.
The form of snake that will be considered here consists of
curves that are piecewise polynomial. That is, the curve is in
general constructed from N segments $\{x_i(s), y_i(s)\}$, $i = 1, \ldots, N$,
where each of the $x_i(s)$ and $y_i(s)$ are polynomials in the
parameter s. As the parameter s is varied a curve is traced
out.

From now on snakes will be referred to as the parametric
curve u(s)=(x(s),y(s)) where s is assumed to vary between 0
and 1. What properties should an 'edge hugging' snake
have?

The snake must be 'driven' by the image. That is, it must
be able to detect an edge in the image and align itself with
the edge. One way of achieving this is to try to position the
snake such that the average 'edge strength' (however that
may be measured) along the length of the snake is maximised.
If the measure of edge strength is $F(x,y) \ge 0$ at the
image point (x,y) then this amounts to saying that the snake
u(s) is to be chosen in such a way that the function

$$\int_{s=0}^{1} F(x(s), y(s)) \, ds \qquad (1)$$

is maximised. This will ensure that the snake will tend to
mould itself to edges in the image if it finds them, but does
not guarantee that it will find them in the first place. Given
an image, the function (1) may have many local minima
when viewed as a static problem. Finding them is where the
'dynamics' arise.

An edge detector applied to an image will tend to produce
an edge map consisting of mainly thin edges. This means
that the edge strength function tends to be zero at most
places in the image, apart from on a few lines. As a
consequence a snake placed some distance from an edge
may not be attracted towards the edge because the edge
strength is effectively zero at the snake's initial position. To
help the snake come under the influence of an edge, the edge
image is blurred to broaden the width of the edges.

If an elastic band were held around a convex object and
then let go, the band would contract until the object prevented
it from doing so further. At this point the band would
be moulded to the object, thus describing the boundary. Two
forces are at work here. Firstly that of providing the natural
tendency of the band to contract; secondly that of providing
the opposing force provided by the object. The band contracts
because it tries to minimise its elastic energy due to
stretching. If the band were described by the parametric
curve u(s)=(x(s),y(s)) then the elastic energy at any point s
is proportional to

$$\left| \frac{du}{ds} \right|^2 = \left( \frac{dx}{ds} \right)^2 + \left( \frac{dy}{ds} \right)^2$$

That is, the energy is proportional to the square of how
much the curve is being stretched at that point. The elastic
energy along its entire length, given the constraint of the
object, is minimised. Hence the elastic band assumes the
shape of the curve u(s)=(x(s),y(s)) where the u(s) minimises
the function

$$\int_{s=0}^{1} \left[ \left( \frac{dx}{ds} \right)^2 + \left( \frac{dy}{ds} \right)^2 \right] ds \qquad (2)$$

subject to the constraints of the object. We would like closed
snakes to have analogous behaviour. That is, to have a
tendency to contract but to be prevented from doing so by
the objects in an image. To model this behaviour the
parametric curve for the snake is chosen so that the function
(2) tends to be minimised. If in addition the forcing term of
function (1) were included then the snake would be prevented
from contracting 'through objects' as it would be
attracted toward their edges. The attractive force would also
tend to pull the snake into the hollows of a concave
boundary, provided that the restoring 'elastic force' was not
too great.

One of the properties of the edges that is difficult to model
is their behaviour when they can no longer be seen. If one
were looking at a car and a person stood in front of it, few
would have any difficulty imagining the contours of the edge
of the car that were occluded. They would be 'smooth'
extensions of the contours either side of the person. If the
above elastic band approach were adopted it would be found
that the band formed a straight line where the car was
occluded (because it tries to minimise energy, and thus
length in this situation). If however the band had some
stiffness (that is, a resistance to bending, as for example
displayed by a flexible bar) then it would tend to form a
smooth curve in the occluded region of the image and be
tangential to the boundaries on either side.

Again a flexible bar tends to form a shape so that its elastic
energy is minimised. The elastic energy in bending is
dependent on the curvature of the bar, that is the second
derivatives. To help force the snake to emulate this type of
behaviour the parametric curve u(s)=(x(s),y(s)) is chosen so
it tends to minimise the function

$$\int_{s=0}^{1} \left[ \left( \frac{d^2x}{ds^2} \right)^2 + \left( \frac{d^2y}{ds^2} \right)^2 \right] ds \qquad (3)$$

which represents a pseudo-bending energy term. Of course,
if a snake were made too stiff it would be difficult to force
it to conform to highly curved boundaries under the action
of the forcing term of function (1).

Three desirable properties of snakes have now been
identified. To incorporate all three into the snake at once, the
parametric curve u(s)=(x(s),y(s)) representing the snake is
chosen so that it minimises the function

$$I(u) = \int_{s=0}^{1} \left\{ \alpha(s) \left[ \left( \frac{d^2x}{ds^2} \right)^2 + \left( \frac{d^2y}{ds^2} \right)^2 \right] + \beta(s) \left[ \left( \frac{dx}{ds} \right)^2 + \left( \frac{dy}{ds} \right)^2 \right] - F(x,y) \right\} ds \qquad (4)$$

Here the terms $\alpha(s) > 0$ and $\beta(s) \ge 0$ represent respectively
the amount of stiffness and elasticity that the snake is to
have. It is clear that if the snake approach is to be successful
then the correct balance of these parameters is crucial. Too
much stiffness and the snake will not correctly hug the
boundaries; too much elasticity and closed snakes will be
pulled across boundaries and contract to a point or may even
break away from boundaries at concave regions. The negative
sign in front of the forcing term is because minimising
$-\int F(x,y)\,ds$ is equivalent to maximising $\int F(x,y)\,ds$.

As it stands, minimising the function (4) is trivial. If the
snake is not closed then the solution degenerates into a
single point (x(s),y(s)) = constant, where the point is chosen
to maximise the edge strength F(x(s),y(s)). Physically, this is
because the snake will tend to pull its two end points
together in order to minimise the elastic energy, and thus
shrink to a single point. The global minimum is attained at
the point in the image where the edge strength is largest. To
prevent this from occurring it is necessary to fix the positions
of the ends of the snake in some way. That is, 'boundary
conditions' are required. It turns out to be necessary to fix
more than just the location of the end points and two further
conditions are required for a well posed problem. A convenient
condition is to impose zero curvature at each end point.

Similarly, the global minimum for a closed-loop snake
occurs when it contracts to a single point. However, in
contrast to a fixed-end snake, additional boundary conditions
cannot be applied to eliminate the degenerate solution.
The degenerate solution in this case is the true global
minimum.

Clearly the ideal situation is to seek a local minimum in
the locality of the initial position of the snake. In practice the
problem that is solved is weaker than this: find a curve
$u(s) = (x(s), y(s)) \in H^2[0,1] \times H^2[0,1]$ such that

$$\left. \frac{d}{d\epsilon} I(u(s) + \epsilon v(s)) \right|_{\epsilon = 0} = 0 \quad \text{for all } v(s) \in H_0^2[0,1] \times H_0^2[0,1] \qquad (5)$$

Here $H^2[0,1]$ denotes the class of real valued functions
defined on [0,1] that have 'finite energy' in the second
derivatives (that is, the integral of the square of the second
derivatives exists [Keller H B, Numerical Methods for Two-Point
Boundary Value Problems, Blaisdell, 1968]) and $H_0^2[0,1]$
is the class of functions in $H^2[0,1]$ that are zero at s=0
and s=1. To see how this relates to finding a minimum
consider u(s) to be a local minimum and $u(s) + \epsilon v(s)$ to be a
perturbation about the minimum that satisfies the same
boundary conditions (i.e. v(0)=v(1)=0).

Clearly, considered as a function of $\epsilon$, $I(\epsilon) = I(u(s) + \epsilon v(s))$ is
a minimum at $\epsilon = 0$. Hence the derivative of $I(\epsilon)$ must be zero
at $\epsilon = 0$. Equation (5) is therefore a necessary condition for a
local minimum. Although solutions to (5) are not guaranteed
to be minima for completely general edge strength
functions, it has been found in practice that solutions are
indeed minima.

Standard arguments in the calculus of variations show that
problem (5) is equivalent to another problem, which is
simpler to solve: find a curve $(x(s), y(s)) \in C^4[0,1] \times C^4[0,1]$
that satisfies the pair of fourth order ordinary differential
equations

$$-\frac{d^2}{ds^2} \left( \alpha \frac{d^2x}{ds^2} \right) + \frac{d}{ds} \left( \beta \frac{dx}{ds} \right) + \frac{1}{2} \frac{\partial F}{\partial x} = 0 \qquad (6)$$

$$-\frac{d^2}{ds^2} \left( \alpha \frac{d^2y}{ds^2} \right) + \frac{d}{ds} \left( \beta \frac{dy}{ds} \right) + \frac{1}{2} \frac{\partial F}{\partial y} = 0 \qquad (7)$$

together with the boundary conditions

x(0), y(0), x(1), y(1) given, and

$$\left. \frac{d^2x}{ds^2} \right|_{s=0} = \left. \frac{d^2y}{ds^2} \right|_{s=0} = \left. \frac{d^2x}{ds^2} \right|_{s=1} = \left. \frac{d^2y}{ds^2} \right|_{s=1} = 0$$

The statement of the problem is for the case of a fixed-end
snake, but if the snake is to form a closed loop then the
boundary conditions above are replaced by periodicity conditions.
Both of these problems can easily be solved using
finite differences.

The finite difference approach starts by discretising the
interval [0,1] into N-1 equispaced subintervals of length
h=1/(N-1) and defines a set of nodes

$$\{ s_i \}_{i=1}^{N} \quad \text{where } s_i = (i-1)h$$

The method seeks a set of approximations

$$\{ (x_i, y_i) \}_{i=1}^{N} \quad \text{to} \quad \{ (x(s_i), y(s_i)) \}_{i=1}^{N}$$

by replacing the differential equations (6) and (7) in the
continuous variables with a set of difference equations in the
discrete variables [Keller H B, ibid.]. Replacing the derivatives
in (6) by difference approximations at the point $s_i$ gives

[difference equation (9), holding for i = 3, 4, ..., N-2; not recoverable from the source text]

Note that the difference equation only holds at internal nodes
in the interval where the indices referenced lie in the range
1 to N. Collecting like terms together, (9) can be written as

$$a_i x_{i-2} + b_i x_{i-1} + c_i x_i + d_i x_{i+1} + e_i x_{i+2} = f_i$$

where

[definitions of the coefficients $a_i$, $b_i$, $c_i$, $d_i$, $e_i$ and $f_i$; not recoverable from the source text]

Discretising both the differential equations (6) and (7) and
taking boundary conditions into account, the finite difference
approximations $x = \{x_i\}$ and $y = \{y_i\}$ to $\{x(s_i)\}$ and $\{y(s_i)\}$,
respectively, satisfy the following system of algebraic
equations

$$Kx = f(x,y), \qquad Ky = g(x,y) \qquad (10)$$

The structure of the matrices K and the right hand vectors
f and g are different depending on whether closed or open
snake boundary conditions are used. If the snake is closed
then fictitious nodes at $s_{-1}$, $s_0$, $s_{N+1}$ and $s_{N+2}$ are introduced
and the difference equation (9) is applied at nodes 1, 2, N-1
and N.
Periodicity implies that $x_0 = x_N$, $x_{-1} = x_{N-1}$, $x_{N+1} = x_1$ and
$x_2 = x_{N+2}$. With these conditions in force the coefficient
matrix becomes

$$K = \begin{pmatrix}
c_1 & d_1 & e_1 & & & & a_1 & b_1 \\
b_2 & c_2 & d_2 & e_2 & & & & a_2 \\
a_3 & b_3 & c_3 & d_3 & e_3 & & & \\
 & \ddots & \ddots & \ddots & \ddots & \ddots & & \\
 & & & a_{N-2} & b_{N-2} & c_{N-2} & d_{N-2} & e_{N-2} \\
e_{N-1} & & & & a_{N-1} & b_{N-1} & c_{N-1} & d_{N-1} \\
d_N & e_N & & & & a_N & b_N & c_N
\end{pmatrix}$$

and the right hand side vector is $(f_1, f_2, \ldots, f_N)^T$.

For fixed-end snakes fictitious nodes at $s_0$ and $s_{N+1}$ are
introduced and the difference equation (9) is applied at nodes
$s_2$ and $s_{N-1}$. Two extra difference equations are introduced
to approximate the zero curvature boundary conditions;
namely $x_0 - 2x_1 + x_2 = 0$ and $x_{N-1} - 2x_N + x_{N+1} = 0$. The coefficient
matrix is now

$$K = \begin{pmatrix}
(c_2 - a_2) & d_2 & e_2 & & & \\
b_3 & c_3 & d_3 & e_3 & & \\
a_4 & b_4 & c_4 & d_4 & e_4 & \\
 & \ddots & \ddots & \ddots & \ddots & \ddots \\
 & & a_{N-2} & b_{N-2} & c_{N-2} & d_{N-2} \\
 & & & a_{N-1} & b_{N-1} & (c_{N-1} - e_{N-1})
\end{pmatrix}$$

and the right hand side vector is

$$\left( f_2 - (2a_2 + b_2) x_1, \; f_3 - a_3 x_1, \; f_4, \; \ldots, \; f_{N-3}, \; f_{N-2} - e_{N-2} x_N, \; f_{N-1} - (d_{N-1} + 2 e_{N-1}) x_N \right)^T$$

The right hand side vector for the difference equations
corresponding to (7) is derived in a similar fashion.

The system of functions (10) represents a set of non-linear
equations that has to be solved. The coefficient matrix is
symmetric and positive definite, and banded for the fixed-end
snake. For a closed-loop snake with periodic boundary
conditions it is banded, apart from a few off-diagonal entries.
As the system is non-linear it is solved iteratively. The
iteration performed is

$$\frac{x_{n+1} - x_n}{\gamma} + K x_{n+1} = f(x_n, y_n), \qquad \frac{y_{n+1} - y_n}{\gamma} + K y_{n+1} = g(x_n, y_n) \quad \text{for } n = 0, 1, 2, \ldots$$

where $\gamma > 0$ is a stabilisation parameter. This can be rewritten
as

$$\left( K + \frac{1}{\gamma} I \right) x_{n+1} = \frac{1}{\gamma} x_n + f(x_n, y_n), \qquad \left( K + \frac{1}{\gamma} I \right) y_{n+1} = \frac{1}{\gamma} y_n + g(x_n, y_n) \quad \text{for } n = 0, 1, 2, \ldots$$

This system has to be solved for each n. For a closed-loop
snake the matrix on the left hand side is difficult to invert
directly because the terms that are outside the main diagonal
band destroy the band structure. In general, the coefficient
matrix K can be split into the sum of a banded matrix B plus
a non-banded matrix A; K=A+B. For a fixed-end snake the
matrix A would be zero. The system of equations is now
solved for each n by performing the iteration

$$\left( B + \frac{1}{\gamma} I \right) x_{n+1}^{(k+1)} = -A x_{n+1}^{(k)} + \frac{1}{\gamma} x_n + f(x_n, y_n) \quad \text{for } k = 0, 1, 2, \ldots$$

The matrix $(B + I/\gamma)$ is a band matrix and can be expressed
as a product of Cholesky factors $LL^T$ [Johnson Reiss, ibid.].
The systems are solved at each stage by first solving

$$L z = -A x_{n+1}^{(k)} + \frac{1}{\gamma} x_n + f(x_n, y_n)$$

followed by

$$L^T x_{n+1}^{(k+1)} = z$$

Notice that the Cholesky decomposition only has to be
performed once.
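
The outer iteration $(K + I/\gamma)\,x_{n+1} = x_n/\gamma + f(x_n, y_n)$ can be illustrated with a small plain-Python sketch for a closed-loop snake. Several things here are assumptions rather than the method of the specification: $\alpha$ and $\beta$ are taken as constants, so that the pentadiagonal coefficients reduce to $a = e = \alpha/h^4$, $b = d = -4\alpha/h^4 - \beta/h^2$ and $c = 6\alpha/h^4 + 2\beta/h^2$ (standard central differences); a dense Gaussian elimination stands in for the banded Cholesky solve; and the function names and the `force` callback, which returns $(\tfrac{1}{2}\partial F/\partial x, \tfrac{1}{2}\partial F/\partial y)$ at a node, are hypothetical.

```python
import math

def closed_snake_matrix(n, alpha, beta, h):
    """Periodic pentadiagonal K for constant alpha and beta (assumed coefficients)."""
    a = alpha / h ** 4
    b = -4.0 * alpha / h ** 4 - beta / h ** 2
    c = 6.0 * alpha / h ** 4 + 2.0 * beta / h ** 2
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for offset, coeff in ((-2, a), (-1, b), (0, c), (1, b), (2, a)):
            K[i][(i + offset) % n] += coeff   # wrap-around gives periodicity
    return K

def solve(A, rhs):
    """Dense Gaussian elimination with partial pivoting (stand-in solver)."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def snake_step(xs, ys, K, gamma, force):
    """One step of (K + I/gamma) x_new = x_old/gamma + f(x_old, y_old)."""
    n = len(xs)
    fx, fy = zip(*(force(x, y) for x, y in zip(xs, ys)))
    A = [[K[i][j] + (1.0 / gamma if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    new_x = solve(A, [x / gamma + f for x, f in zip(xs, fx)])
    new_y = solve(A, [y / gamma + f for y, f in zip(ys, fy)])
    return new_x, new_y

# A 24-node circular snake with no image force simply contracts.
n = 24
xs = [10.0 * math.cos(2 * math.pi * i / n) for i in range(n)]
ys = [10.0 * math.sin(2 * math.pi * i / n) for i in range(n)]
K = closed_snake_matrix(n, alpha=0.1, beta=1.0, h=1.0)
for _ in range(10):
    xs, ys = snake_step(xs, ys, K, gamma=0.5, force=lambda x, y: (0.0, 0.0))
```

With a zero force the loop shrinks towards its centroid, which is the degenerate closed-loop solution discussed earlier; a non-zero `force` derived from a blurred edge map is what would hold the snake on a boundary. In a practical implementation the left-hand matrix would be factorised once and reused for every step, as the text notes for the Cholesky decomposition.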

Model-based coding schemes use 2-D or 3-D models of
scene objects in order to reduce the redundancy in the
information needed to encode a moving sequence of
images. The location and tracking of moving objects is of
fundamental importance to this. Videoconference and videophone
type scenes may present difficulties for conventional
machine vision algorithms as there can often be low contrast
and 'fuzzy' moving boundaries between a person's hair and
the background. Adaptive contour models or 'snakes' form
a class of techniques which are able to locate and track
object boundaries; they can fit themselves to low contrast
boundaries and can fill in across boundary segments
between which there is little or no local evidence for an
edge. This paper discusses the use of snakes for isolating the
head boundary in images as well as a technique which
combines block motion estimation and the snake: the
'power-assisted' snake.

The snake is a continuous curve (possibly closed) which
attempts to dynamically position itself from a given starting
position in such a way that it clings to edges in the image.
Full details of the implementation for both closed and
'fixed-end' snakes are given in Waite J B, Welsh W J, "Head
Boundary Location using Snakes", British Telecom Technology
Journal, Vol 8, No 3, July 1990, which describes two
alternative implementation strategies: finite elements and
finite differences. We implemented both closed and fixed-end
snakes using finite differences. The snake is initialised
around the periphery of a head-and-shoulders image and
allowed to contract under its own internal elastic force. It is
also acted on by forces derived from the image which are
generated by first processing the image using a Laplacian-type
operator with a large space constant, the output of which
is rectified and modified using a smooth non-linear function.
The rectification results in isolating the 'valley' features
which have been shown to correspond to the subjectively
important boundaries in facial images; the non-linear function
effectively reduces the weighting of strong edges relative
to weaker edges in order to give the weaker boundaries
a better chance to influence the snake. After about 200
iterations of the snake it reaches the position hugging the
boundary of the head. In a second example, a fixed-end
snake with its end points at the bottom corners of the image
was allowed to contract in from the sides and top of the
image. The snake stabilises on the boundary between hair
and background although this is a relatively low-contrast
boundary in the image. As the snake would face problems
trying to contract across a patterned background, it might be
better to derive the image forces from a moving edge
detector.

In Kass et al ibid, an example is shown of snakes being
used to track the moving lips of a person. First the snake is
stabilised on the lips in the first frame of a moving sequence
of images; in the second frame it is initialised in the position
corresponding to its stable position in the previous frame
and allowed to achieve equilibrium again. There is a clear
problem with the technique in this form in that if the motion
is too great between frames, the snake may lock on to
different features in the next frame and thus lose track. Kass
suggests a remedy using the principle of 'scale-space
continuation': the snake is allowed to stabilise first on an image
which has been smoothed using a Gaussian filter with a large
space constant; this has the effect of pulling the snake in
from a large distance. After equilibrium has occurred, the
snake is presented with a new set of image forces derived by
using a Gaussian with slightly smaller space constant and
the process is continued until equilibrium has occurred in the
image at the highest level of resolution [...].

This is clearly a computationally expensive process; a
radically simpler technique has been developed and found to
work well and this will now be described. After the snake
has reached equilibrium in the first frame of a sequence,
block motion estimation is carried out at the positions of the
snake nodes (the snake is conventionally implemented by
approximating it with a set of discrete nodes, 24 in our
implementation). The motion estimation is performed from
one frame into the next frame, which is the opposite sense to
that conventionally performed during motion compensation
for video coding. If the best match positions for the blocks
are plotted in the next frame then, due to the 'aperture
problem', a good match can often be found at a range of
points along a boundary segment which is longer than the
side of the block being matched. The effect is to produce a
nonuniform spacing of the points. The snake is then
initialised in the frame with its nodes at the positions of the best
match block positions and allowed to run for a few
iterations, typically ten. The result is the nodes are now [...]

[...] simpler to implement than an equivalent coarse-to-fine
resolution technique. The methods described in the paper have
so far been tested in images where the object boundaries do
not have very great or discontinuous curvature at any point;
if these conditions are not met the snake would fail to
conform itself correctly to the boundary contour. One
solution, currently being pursued, is to effectively split the
boundaries into a number of shorter segments and fit these
segments with several fixed-end snakes.

According to a second aspect of the present invention a
method of verifying the identity of the user of a data carrier
comprises: generating a digital facial image of the user;
receiving the data carrier and reading therefrom identification
data; performing the method of the first aspect of the
present invention; comparing each feature vector, or data
derived therefrom, with the identification data; and generating
a verification signal in dependence upon the comparison.

According to a yet further aspect of the present invention
apparatus for verifying the identity of the user of a data
carrier comprises: means for generating a digital facial
image of the user; means for receiving the data carrier and
reading therefrom identification data; means for performing
the method of the first aspect of the present invention; and
means for comparing each feature vector, or data derived
therefrom, with the identification data and generating a
verification signal in dependence upon the comparison.

BRIEF DESCRIPTION OF THE DRAWINGS

An embodiment of the invention will now be described,
by way of example only, with reference to the accompanying
drawings in which:

FIG. 1 is a flow diagram of the calculation of a feature
vector;

FIG. 2 illustrates apparatus for credit verification using
the method of the present invention; and

FIG. 3 illustrates the method of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Referring to FIG. 1, an overview of an embodiment
incorporating both aspects of the invention will be
described.

An image of the human face is captured in some [...]
a camera [...] digitised to provide an array of pixel
values. A head detection algorithm is employed to locate the
position within the array of the face or head. This head
[...] stage may comprise one of several known methods
but is preferably a method using the above described

uniformly Muted along the boundary; the snake has '^ ^S™' ^ T *** 

successfully tracked the boundary, the block motion estima- thus determined are ignored. 

tion having acted as a sort of 'power-assist' which will 55 The second step is carried out on the pixel date lying 

ensure tracking is maintained as long as the overall motion inside the boundaries to locate the features to be used for 

is not greater than the maximum displacement of the block recognition— typically the eyes and mouth. Again, several 

search. As the motion estimation is performed at only a location techniques for finding the position of the eyes and 

small set of points, the computation time is not increased mouth are known from the prior art but preferably a two 

significantly 60 stage process of coarse location followed by fine location is 

Both fixed-end and closed snakes have been shown to employed. The coarse location technique might, for 

perform object boundary location even in situations where example, be that described in U.S. Pat. No. 4,841575. 

there may be a low contrast between the object and its The fine location technique preferably uses the deform- 

background. able template technique described by Yuiile et al "Feature 

A composite technique using bom block motion estima- 65 Extraction from Faces using Deformable Templates", Har- 
tion and snake fitting has been shown to perform boundary vard Robotics Lab Technical Report Number; 88/2 pub- 
tracking in a sequence of moving images. The technique is lished in Computer Vision and Pattern Recognition. June 
1989 IEEE. In this technique, which has been described above, a line
model topologically equivalent to the feature is positioned (by the
coarse location technique) near the feature and is iteratively moved
and deformed until the best fit is obtained.

Next, the shape of the feature is changed until it assumes a standard,
topologically equivalent shape. If the fine location technique utilised
deformable templates as disclosed above, then the deformation of the
feature can be achieved to some extent by reversing the deformation of
the template to match the feature to the initial, standard shape of the
template.

Since the exact position of the feature is now known, and its exact
shape is specified, this information can be employed to identify the
image and as a supplement to the recognition process using the feature
vector of the feature.

All image data outside the region identified as being the feature are
ignored and the image data identified as being the feature are resolved
into its orthogonal eigen-picture components corresponding to that
feature. The component vector is then compared with a component vector
corresponding to a given person to be identified and, in the event of
substantial similarity, recognition may be indicated.

Referring to FIG. 2, an embodiment of the invention suitable for credit
card verification will now be described.

A video camera 1 receives an image of a prospective user of a credit
card terminal. Upon entry of the card to a card entry device 2, the
analogue output of the video camera 1 is digitised by an AD converter
3, and sequentially clocked into a framestore 4. A video processor 5
(for example, a suitably programmed digital signal processing chip such
as the AT&T DSP 20) is connected to access the framestore 4 and
processes the digital image therein to form an edge-enhanced image. One
method of doing this is simply to subtract each sample from its
predecessor to form a difference picture, but a better method involves
the use of a Laplacian type of operator, the output of which is
modified by a sigmoidal function which suppresses small levels of
activity due to noise as well as very strong edges whilst leaving
intermediate values barely changed. By this means, a smoother edge
image is generated, and weak edge contours such as those around the
line of the chin are enhanced. This edge picture is stored in an edge
picture frame buffer 6. The processor then executes a closed loop snake
method, using finite differences, to derive a boundary which
encompasses the head. Once the snake algorithm has converged, the
position of the boundaries of the head in the edge image and hence the
corresponding image in the frame store 4 is now in force.

The edge image in the framestore 6 is then processed to derive a coarse
approximation to the location of the features of interest, typically
the eyes and the mouth. The method of Nagao is one suitable technique
(Nagao M, "Picture Recognition and Data Structure", Graphic Languages,
ed. Rosenfield) as described in our earlier application EP0225729. The
estimates of position thus derived are used as starting positions for
the dynamic template process which establishes the exact feature
position.

Accordingly, processor 5 employs the method described in Yuille et al
[Yuille A, Cohen D, Hallinan P, (1988), "Facial Feature Extraction by
Deformable Templates", Harvard Robotics Lab. Technical Report no. 88-2]
to derive position data for each feature which consists of a size (or
resolution) and a series of point coordinates given as fractions of the
total size of the template. Certain of these points are designated as
keypoints which are always internal to the template, the other points
always being edge points. The key point position data are stored, and
may also be used as recognition indicia. This is indicated in FIG. 3.

Next, the geometrical transformation of the feature to the [...] facets
of the regions and the templates. The facets consist of local
collections of three points and are defined in the template definition
files. The mapping is formed by considering the x, y values of template
vertices with each of the x', y' values of the corresponding region
vertices; this yields two equations from which each x', y' point can be
calculated given any x, y within the template facet, and thus the image
data can be mapped from the region sub-image.

The entire template sub-image is obtained by rendering (or
scan-converting) each constituent facet pel by pel, taking the pel's
value from the corresponding mapped location in the equivalent region's
sub-image.

The processor 5 is arranged to perform the mapping of the extracted
region sub-images to their corresponding generic template size and
shape. The keypoints on the regions form a triangular mesh with a
corresponding mesh defined for the generic template shape; mappings are
then formed from each triangle in the generic mesh to its equivalent in
the region mesh. The distorted sub-images are then created and stored
in the template data structures for later display.

The central procedure in this module is the 'template stretching'
procedure. This routine creates each distorted template sub-image facet
by facet (each facet is defined by three connected template points). A
mapping is obtained from each template facet to the corresponding
region facet and then the template facet is filled in pel by pel with
image data mapped from the region sub-image. After all facets have been
processed in this way the distorted template sub-image will have been
completely filled in with image data.

The standardised feature image thus produced is then stored in a
feature image buffer 7. An eigen-picture buffer 8 contains a plurality
(for example 50) of eigen-pictures of increasing sequency which have
previously been derived [...]. A transform processor 9 (which may in
practice be realised as processor 5 acting under suitable stored
instructions) derives the co-ordinates or components of the feature
image with regard to each eigen-picture, to give a vector of 50
numbers, using the method described above.

The card entry device 2 reads from the inserted credit card the 50
components which characterise the correct user of that card, which are
input to a comparator 10 (which may in practice be realised as part of
a single processing device) which measures the distance in pattern
space between the two vectors. The preferred metric is the Euclidean
distance, although other distance metrics (e.g. a "city block" metric)
could equally be used. If this distance is less than a predetermined
threshold, correct recognition is indicated to an output 11; otherwise,
recognition failure is signalled.

Other data may also be incorporated into the recognition process; for
example, data derived during template deformation, or head measurements
(e.g. the ratio of head height to head width derived during the head
location stage) or the feature position data as mentioned above.
Recognition results may be combined in the manner indicated in our
earlier application GB9005190.5.

Generally, some preprocessing of the image is provided (indicated
schematically as 12 in FIG. 2); for example, noise filtering (spatial
or temporal) and brightness or contrast normalisation.
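The facet-by-facet 'template stretching' procedure described in this section can be illustrated with a short Python sketch. It is only one possible reading of the text, not the patented implementation: the function names (`affine_from_facets`, `inside_facet`, `stretch_facet`), the use of nearest-neighbour sampling, the clamping at the sub-image border, and the assumption of integer vertex coordinates are all hypothetical choices made for the example. The "two equations" are taken here to be x' = a*x + b*y + c and y' = d*x + e*y + f, fixed by the three vertex correspondences of a facet.

```python
def affine_from_facets(template_tri, region_tri):
    """Return (a, b, c, d, e, f) with x' = a*x + b*y + c and y' = d*x + e*y + f,
    mapping the three template vertices onto the three region vertices."""
    (x1, y1), (x2, y2), (x3, y3) = template_tri
    det = x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)
    if det == 0:
        raise ValueError("template facet is degenerate (collinear vertices)")

    def solve(u1, u2, u3):
        # Cramer's rule for one output coordinate (either x' or y').
        a = (u1 * (y2 - y3) + u2 * (y3 - y1) + u3 * (y1 - y2)) / det
        b = (u1 * (x3 - x2) + u2 * (x1 - x3) + u3 * (x2 - x1)) / det
        c = (u1 * (x2 * y3 - x3 * y2) + u2 * (x3 * y1 - x1 * y3)
             + u3 * (x1 * y2 - x2 * y1)) / det
        return a, b, c

    (u1, v1), (u2, v2), (u3, v3) = region_tri
    return solve(u1, u2, u3) + solve(v1, v2, v3)


def inside_facet(x, y, tri):
    """True if pel (x, y) lies inside or on the boundary of the triangle."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    d1 = (x - x2) * (y1 - y2) - (x1 - x2) * (y - y2)
    d2 = (x - x3) * (y2 - y3) - (x2 - x3) * (y - y3)
    d3 = (x - x1) * (y3 - y1) - (x3 - x1) * (y - y1)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def stretch_facet(region_img, template_img, template_tri, region_tri):
    """Fill one template facet pel by pel with data mapped from the region.
    Images are lists of rows; vertex coordinates are assumed to be integers."""
    a, b, c, d, e, f = affine_from_facets(template_tri, region_tri)
    xs = [p[0] for p in template_tri]
    ys = [p[1] for p in template_tri]
    height, width = len(template_img), len(template_img[0])
    r_height, r_width = len(region_img), len(region_img[0])
    for y in range(max(0, min(ys)), min(height - 1, max(ys)) + 1):
        for x in range(max(0, min(xs)), min(width - 1, max(xs)) + 1):
            if inside_facet(x, y, template_tri):
                # Nearest-neighbour sample, clamped to the region sub-image.
                sx = min(max(int(round(a * x + b * y + c)), 0), r_width - 1)
                sy = min(max(int(round(d * x + e * y + f)), 0), r_height - 1)
                template_img[y][x] = region_img[sy][sx]


# Usage: a region facet that is the template facet shifted by (1, 1).
region = [[10 * r + c for c in range(5)] for r in range(5)]
template = [[None] * 4 for _ in range(4)]
stretch_facet(region, template, ((0, 0), (3, 0), (0, 3)), ((1, 1), (4, 1), (1, 4)))
```

Calling `stretch_facet` once per facet of the triangular mesh would fill the whole standardised sub-image; pels on shared facet edges are simply written more than once.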
Variations in lighting can produce a spatially variant effect
on the image brightness due to shadowing by the brows etc. 
It may be desirable to further pre-process the images to 
remove most of this variation by using a second derivative 
operator or morphological filter in place of the raw image 
data currently used. A blurring filter would probably also be 
required. 
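As an illustration of why a second derivative operator helps, the following Python sketch blurs an image and then applies a discrete Laplacian. The 3x3 binomial blur kernel, the 4-neighbour Laplacian, and the edge-replicating border handling are assumptions chosen for brevity; the text does not specify which operator or filter would be used. Because both steps are linear, a smooth brightness ramp (a crude stand-in for a lighting gradient) contributes nothing to the output away from the image border.

```python
def convolve3x3(img, kernel):
    """Convolve a list-of-rows image with a 3x3 kernel, replicating edge pels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in (-1, 0, 1):
                yy = min(max(y + ky, 0), h - 1)
                for kx in (-1, 0, 1):
                    xx = min(max(x + kx, 0), w - 1)
                    acc += kernel[ky + 1][kx + 1] * img[yy][xx]
            out[y][x] = acc
    return out


# Assumed kernels: a binomial blur and a 4-neighbour Laplacian.
BLUR = [[1 / 16, 2 / 16, 1 / 16],
        [2 / 16, 4 / 16, 2 / 16],
        [1 / 16, 2 / 16, 1 / 16]]
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def preprocess(img):
    """Blur first to limit noise amplification, then take the second derivative."""
    return convolve3x3(convolve3x3(img, BLUR), LAPLACIAN)


# Usage: the same bright spot with and without a linear lighting ramp.
size = 7
spot = [[100 if (x, y) == (3, 3) else 0 for x in range(size)] for y in range(size)]
lit = [[spot[y][x] + 3 * x + 2 * y for x in range(size)] for y in range(size)]
plain_out = preprocess(spot)
lit_out = preprocess(lit)
```

In this example `plain_out` and `lit_out` agree at every pel at least two pels from the border; nearer the border the edge replication makes the ramp visible again.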

It might also be desirable to reduce the effects of varia- 
tions in geometric normalisation on the representation vec- 
tors. This could be accomplished by using low-pass filtered 
images throughout which should give more stable represen- 
tations for recognition purposes. 
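The representation vectors discussed here are the components of a standardised feature image with regard to each eigen-picture, which the comparator then tests against the stored vector using a distance threshold. A minimal Python sketch of that projection and comparison follows. It assumes flattened images, orthonormal eigen-pictures, and subtraction of a training-set mean image before projection; the mean subtraction and all function names are assumptions for illustration, not details given in the text. Any low-pass filtering would be applied to the feature image before `project` is called.

```python
import math


def project(feature, mean, eigen_pictures):
    """Components of a flattened feature image with regard to each eigen-picture."""
    centred = [p - m for p, m in zip(feature, mean)]
    return [sum(e_i * c_i for e_i, c_i in zip(eigen, centred))
            for eigen in eigen_pictures]


def euclidean(u, v):
    """Euclidean distance in pattern space (the preferred metric in the text)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def city_block(u, v):
    """'City block' distance, the alternative metric mentioned in the text."""
    return sum(abs(a - b) for a, b in zip(u, v))


def verify(candidate, stored, threshold, metric=euclidean):
    """Signal recognition when the distance is below the threshold."""
    return metric(candidate, stored) < threshold


# Usage with a toy 2x2 feature image and two eigen-pictures (not 50).
eigen_pictures = [[1, 0, 0, 0], [0, 1, 0, 0]]
mean_image = [0, 0, 0, 0]
vector = project([3, 4, 5, 6], mean_image, eigen_pictures)
accepted = verify(vector, [0, 0], threshold=6)
rejected = verify(vector, [0, 0], threshold=4)
```

The threshold trades false acceptance against false rejection, so in practice it would have to be chosen from measured distances rather than fixed in advance.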

We claim: 

1. A method of processing an image represented by an
array of digital signals, said method comprising the steps of:

locating within the image array of digital signals the
position of at least one predetermined feature;

extracting image data from said image array of digital
signals representing each said feature;

generating for each feature a feature vector digital signal
representing the position of the image data of the
feature in an N-dimensional space, N being an integer
and said space being defined by a plurality of reference
vector signals each of which is an eigenvector of a
training set of images of like features; and

modifying the digital signal image data of each feature to
normalize the shape of each feature thereby to reduce
its deviation from a predetermined standard shape of
said feature, which step is carried out before generating
the feature vector signal for each feature.

2. A method of processing an image represented by an
array of digital signals, said method comprising the steps of:

locating within the image array of digital signals the
position of at least one predetermined feature;

extracting image data from said image array of digital
signals representing each said feature;

generating for each feature a feature vector digital signal
representing the position of the image data of the
feature in an N-dimensional space, N being an integer
and said space being defined by a plurality of reference
vector signals each of which is an eigenvector of a
training set of images of like features; and

modifying the digital signal image data of each feature to
normalize the shape of each feature thereby to reduce
its deviation from a predetermined standard shape of
said feature, which step is carried out before generating
the feature vector signal for each feature,

wherein the step of modifying the digital signal image
data of each feature uses a deformable template.

3. A method of processing an image represented by an
array of digital signals, said method comprising the steps of:

locating within the image array of digital signals the
position of at least one predetermined feature;

extracting image data from said image array of digital
signals representing each said feature;

generating for each feature a feature vector digital signal
representing the position of the image data of the
feature in an N-dimensional space, N being an integer
and said space being defined by a plurality of reference
vector signals each of which is an eigenvector of a
training set of images of like features; and

modifying the digital signal image data of each feature to
normalize the shape of each feature thereby to reduce
its deviation from a predetermined standard shape of
said feature, which step is carried out before generating
the feature vector signal for each feature,

wherein the step of locating within the digital signal
image the position of at least one predetermined feature
employs:

a first technique to provide a coarse estimation of position,
and

a second, different, technique to improve upon the coarse
estimation.

4. A method according to claim 3 wherein the second
technique uses a deformable template.

5. A method according to claim 1 in which the training set
of digital signal images of like features are modified to
normalize the shape of each of the training set of digital
signal images to reduce deviation from a predetermined
standard shape of features and subsequently the reference
vector eigenvectors of the set of digital signal
images are generated.

6. A method according to claim 1 wherein said locating 
step comprises: 

locating a portion of the digital signal image by deter- 
mining parameters of a closed curve arranged to lie 
adjacent a plurality of edge features of the digital signal 
image, said curve being constrained to exceed a mini- 
mum curvature and to have a minimum length com- 
patible therewith. 

7. A method as in claim 6 in which the curve is initially 
generated proximate the plurality of edge features of the 
digital signal image, and subsequently interactively 
re-generated to reduce its distance from said edge features. 

8. A method according to claim 6 in which the located 
portion of the digital signal image is a face or a head of a 
person. 

9. A method of verifying the identity of the user of a data 
carrier, said method comprising the steps of: 

generating a digital signal facial image of the user; 
receiving the data carrier and reading therefrom identifi- 
cation data signals; 
performing on the digital signal facial image the steps of:

(i) locating within the digital signal image the position
of at least one predetermined feature;

(ii) extracting digital signal image data from said image
representing each said feature;

(iii) generating for each feature a feature vector signal
representing the position of the image data of the
feature in an N-dimensional space, N being an integer
and said space being defined by a plurality of
reference vector signals each of which is an
eigenvector of a training set of images of like features; and

(iv) modifying the digital signal image data of each 
feature to normalize the shape of each feature 
thereby to reduce its deviation from a predetermined 
standard shape of said feature, which step is carried 
out before generating the feature vector for each 
feature; 

comparing data signals representing each feature vector 
with the identification data signals; and 

generating a verification signal in dependence upon the 
comparison. 

10. Apparatus for verifying the identity of the user of a 
data carrier comprising: 

means for generating a digital signal facial image of the 
user; 

means for receiving the data carrier and reading therefrom 

identification data signals; 
means for performing on the digital signal facial image 

the steps of: 
(i) locating within the digital signal image the position
of at least one predetermined feature;

(ii) extracting digital signal image data from said image
representing each said feature;

(iii) generating for each feature a feature vector signal
representing the position of the image data of the
feature in an N-dimensional space, N being an integer
and said space being defined by a plurality of
reference vector signals each of which is an
eigenvector of a training set of images of like features; and

(iv) modifying the digital signal image data of each
feature to normalize the shape of each feature
thereby to reduce its deviation from a predetermined
standard shape of said feature, which step is carried
out before calculating the feature vector for each
feature; and

means for comparing data signals representing each fea- 
ture vector with the identification data signals and 
generating a verification signal in dependence upon the 
comparison. 

11. Apparatus as in claim 10 in which the means for
generating a digital signal facial image of the user comprises
a video camera, the output of which is connected to an AD
converter.

***** 